Constructing of an Ontology-based Lexicon for Bulgarian

Authors

  • Kiril Ivanov Simov
  • Petya Osenova
Abstract

In this paper we report on progress in the creation of an ontology-based lexicon for Bulgarian. We started with the concept set from an upper ontology (DOLCE), which was then extended with concepts selected from OntoWordNet that correspond to the Core WordNet and EuroWordNet Basic Concepts. The underlying idea behind the ontology-based lexicon is its organization via two semantic relations: equivalence and subsumption. These relations reflect the distribution of lexical unit senses with respect to the concepts in the ontology. The lexical unit candidates for concept mapping have been selected from two large and well-developed lexical resources for Bulgarian: a machine-readable explanatory dictionary and a morphological lexicon. In the initial step, we handled the lexical units whose senses are equivalent to concepts in the ontology (2500 at the moment). In the second stage, we are proceeding with lexical units selected by their frequency distribution in a large Bulgarian corpus. This step is the more challenging one, since it might also require adding concepts to the ontology. The main applications of the lexicon are envisaged to be semantic annotation and semantic IR for Bulgarian.
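The organization described in the abstract can be pictured with a small sketch. This is an illustration only, not the authors' implementation: the concept names, the two Bulgarian entries, and the helper functions `ancestors`, `annotate`, and `lemmas_for` are all hypothetical, and the sketch assumes that "subsumption" means the sense of the lexical unit is narrower than the concept it is mapped to.

```python
from dataclasses import dataclass

# Toy ontology: child concept -> parent concept.
# All concept names are illustrative and not taken from DOLCE or OntoWordNet.
SUPER = {
    "Dog": "Animal",
    "Animal": "PhysicalObject",
}


@dataclass(frozen=True)
class Mapping:
    lemma: str     # lexical unit
    concept: str   # ontology concept the sense is mapped to
    relation: str  # "equivalence" or "subsumption"


# Hypothetical lexicon entries.
LEXICON = [
    # "dog": the sense is assumed to match the concept exactly.
    Mapping("куче", "Dog", "equivalence"),
    # "greyhound": the sense is assumed to be narrower than the concept.
    Mapping("хрътка", "Dog", "subsumption"),
]


def ancestors(concept):
    """Return all superconcepts of a concept, nearest first."""
    out = []
    while concept in SUPER:
        concept = SUPER[concept]
        out.append(concept)
    return out


def annotate(lemma):
    """Return (concept, relation) pairs recorded for a lemma."""
    return [(m.concept, m.relation) for m in LEXICON if m.lemma == lemma]


def lemmas_for(concept):
    """Return lemmas mapped to the concept or to any of its subconcepts."""
    return sorted(
        m.lemma
        for m in LEXICON
        if m.concept == concept or concept in ancestors(m.concept)
    )
```

In a setup like this, `annotate` corresponds roughly to semantic annotation (from word to concept), while `lemmas_for` corresponds to the query-expansion side of semantic IR (from concept to words).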

Similar resources

Ontology-Based Lexicon of Bulgarian

In contrast to morphological and syntactic processing, semantic annotation based on domain ontology is still underdeveloped for Bulgarian. On the other hand, the prerequisites for an ontological annotation are already available. These are as follows: a morphosyntactic tagger for Bulgarian with more than 95% accuracy; a dependency parser with more than 84% accuracy; a general chunker and a na...

A Treebank-driven Creation of an OntoValence Verb lexicon for Bulgarian

The paper presents a treebank-driven approach to the construction of a Bulgarian valence lexicon with ontological restrictions over the inner participants of the event. First, the underlying ideas behind the Bulgarian Ontology-based lexicon are outlined. Then, the extraction and manipulation of the valence frames is discussed with respect to the BulTreeBank annotation scheme and DOLCE ontology....

Bulgarian Language Resources for Ontology-Based Semantic Search

This paper presents the language resources, which would facilitate the ontology-based semantic search. Some of these resources are language independent, such as the domain ontology. Some depend on the specific language: terminological lexicons, annotation grammars, sense disambiguation rules, gold standard corpus. Here we focus on the Bulgarian resources constructed in two domains for supportin...

New Applications of “Ontology-to-Text Relation” Strategy for Bulgarian Language

The paper presents new applications of the Ontology-to-Text Relation Strategy to the Bulgarian Iconographic Domain. First, the strategy itself is discussed within the triple of ontology, terminological lexicon, and annotation grammars; then the related work is reviewed. The specifics of semantic annotation and evaluation over iconographic data are also presented. A family of domain ontologies over the iconogra...

Shallow Semantic Annotation of Bulgarian

The paper discusses shallow semantic annotation of Bulgarian treebank. Our goal is to construct the next layer of linguistic interpretation over the morphological and syntactic layers that have already been encoded in the treebank. The annotation is called shallow because it encodes only the senses for the non-functional words and the relations between the semantic indices connected to them. We...

Publication date: 2010